The assessment of children in their years before school and their first years of
school has been, traditionally, informal. Further, assessment of children's 
mathematical skills at this level has been infrequent compared to social, 
emotional and physical assessments. However, there are contexts where 
reliable, valid, standardised data from assessment in mathematics are required. 
This paper outlines the development of two assessment tools for mathematics 
that were originally developed for such contexts. Item Response Theory (IRT) 
analyses enabled the construction of assessment forms that address the range of 
abilities of 4- to 8-year-old children, and provided the scales used for 
constructing formative and summative reports of achievement. A description of 
the development of the assessment tools and the IRT analysis that provides the 
reporting formats are presented together with some research uses of the tools. 


This article describes the development of two mathematics assessment tools
suitable for use at the pre-school level, where formal assessment is rare. The
article also describes how two issues in classroom assessment that challenge
the development of assessment tools at this level were overcome. These
issues are: the wide range of mathematical understandings of children of this
early age; and the need to provide reporting to teachers that will assist in
planning for appropriate mathematical learning experiences for the children
assessed.

The purposes of this article are: to demonstrate the possibilities for
standardised assessment in the early years; to show how Item Response
Theory (IRT) analyses can provide reporting formats for assisting early years
professionals; and to describe some examples of two assessment tools in
research contexts. While there are some necessary differences in the detail of
these assessment tools, the development of one parallels the other. In some
of the following sections both tools are described separately, and in other
sections the tools are discussed together. When examples are used, the source
assessment tool is indicated.


Background 

The dominance of constructivist approaches to mathematics learning in the
early years of schooling has begun to change perspectives on effective
practice for young children. The earlier Piagetian notions of stages of
development are giving way to the realisation that effective learning takes
place when a rich, supportive environment offers challenge and relevance
(e.g., Cook, 1996; Doig, McCrae, & Rowe, 2003). Further, there is a growing
awareness among early years professionals of the wide range of mathematical
capabilities of children entering pre-school and school (Aubrey, 1997; Bottle,
1998; Doig et al., 2003; Groves & Cheeseman, 1993; Munn, 1994; Nixon &
Aldwinckle, 1997).

The Australian Council for Educational Research (ACER) conducted a 
study examining the relationship between age of entry to school, school 
structure, curriculum, teacher expectations, and student outcomes in 
language and mathematics (Curriculum and Organisation in the Early Years 
of School, 1997-1999). For the purposes of this study, it was necessary to have
measures of developmental progress that were applicable to children at the
pre-school level and in the early years of schooling, were easy to administer
and score, and provided readily interpretable results. The budget for the
study prohibited the use of individually administered instruments, and it
was considered unlikely that teacher observations over an extended period
of time would provide reliable data. As no suitable instruments meeting
these criteria could be found, it was necessary to develop measures
specifically for use in the study. The two different tools that were created
were: the Who Am I? developmental assessment material (de Lemos & Doig,
1999a, 1999b); and I Can Do Maths (Doig & de Lemos, 2000).

Who Am I? was developed from previous research on the use of copying 
tasks for the assessment of developmental level and school readiness (de 
Lemos, 1973, 1980; de Lemos & Larsen, 1979; de Lemos & Mellor, 1991). This 
work was subsequently used as a basis for developing a measure of school 
readiness based on copying tasks (Larsen, 1987). While other less familiar or 
regular figures could have been included in Who Am I?, Piaget and 
Inhelder's (1956) research linked the stages that they observed in children's 
ability to copy regular geometrical forms to cognitive development. These 
earlier studies indicated that copying tasks were strongly associated with 
subsequent school achievement, and provided a reliable measure of 
development. 

Measures of spontaneous writing were included in Who Am I? as
indicators of developmental levels, because there is a link between children's
early attempts at writing and their growing understanding of the way in
which spoken sounds are represented by print (Ferreiro & Teberosky, 1982).
The link between this form of writing and emergent literacy is supported by
the work of Clay (1993) and Hannavy (1993), while the work of Snow, Burns,
and Griffin (1998) has shown that letter recognition is strongly related to later
achievement in reading.

The final task in Who Am I? asks children to draw a picture of
themselves. This well-known developmental task has also been used as a
measure of developmental level by Brenner (1964), de Lemos (1973), and
Harris (1963).

In a similar manner to Who Am I?, the development of I Can Do Maths
was influenced by the understanding that children come into pre-school
and school with a wide range of experiences and understandings fostered
by parents. For example, in an English study of 3- and 4-year-olds'
mathematical knowledge prior to pre-school or school, it was found that the
"children showed considerable knowledge and some consistent patterns of
responding . . . [and] the findings are unlikely to result from children noticing
the numerals unaided and inventing their own ideas about what they mean"
(Ewers-Rogers & Cowan, 1996, p. 23). Other examples include those from the
work of Gelman and Gallistel (1978), who reported that "children as young
as two years can accurately judge numerosity provided that the numerosity
is not larger than two or three" (p. 55), and Zill, Collins, West, and Hausken
(1995), who found that children of ages 3 to 5 had a wide range of
mathematical skills and urged pre-school teachers to maintain children's
engagement to further develop these skills and understandings.

Research shows that children make great progress in terms of
curriculum content during their first year at school. Suggate, Aubrey, and
Pettitt (1997) tested children on rote counting, counting objects, and reading,
writing and ordering numbers. Tymms, Merrell, and Henderson's (1997)
study of children's development during the first year of school also showed
a "massive difference to the attainment of pupils in Reading and Maths"
(p. 117), after allowing for pupil background factors. Stewart, Wright, and
Gould's (1998) study showed that "progress [in mathematics] was made by
the majority of students and syllabus expectations were not only reached
but exceeded by many of these students" (p. 562).

Although some earlier experimentation had shown that young children
can cope with written response formats (Doig, 1995), some of the children in
the Curriculum and Organisation in the Early Years of School, 1997-1999
project were very young (3 years of age), and it was decided that questions
be presented orally to reduce the reading and writing loads on the children.
Item content was based on the content of the national profiles in mathematics
(Australian Education Council, 1994) in which the early levels focus on
concepts and skills in Number, Measurement, Chance and Data, and Space.

Group administration of the assessment items was used to reduce the
time required for administration, although this meant that children would
need to record their own responses in some way. Further, two different
assessment forms were used at different year levels to shorten the time
required of the children, and to provide the most appropriate set of
questions.

In all, a set of 150 questions was constructed from which a final set of 47
items was selected for the published version of I Can Do Maths. This set was
broken into two sub-sets, with the second set containing some harder items
that were only administered to children in their second and third year of
school. The identification of these harder items was determined in discussion
with early years practitioners.

As with Who Am I?, the I Can Do Maths items are administered orally in
a lock-step fashion; that is, all children work on the same question at the
same time, and advance through the questions at the same pace.
These questions were in two formats: either they had a disguised
multiple-choice response format, or they asked for a simple, written,
numerical response. Figures 1 and 2 show the two different question formats.


Put a ✓ on the tallest tree. 



Count how many fish there are. 



Write the number of fish there are.

Figure 2. A question requiring a written, numerical response.


Reporting Requirements 

As the achievements of children in the early years would be useful to early
years professionals, reporting the results of assessment in a clear and
comprehensible manner was of paramount importance. It was decided that
three different reports would be provided: a normative report, showing how
children assessed were placed with respect to other children of that age, or in
that year of schooling; a report that presented diagnostic information for
professional use; and, finally, a descriptive report for parents.

The range of reports envisaged for both assessment tools suggested that
an IRT analysis would be more fruitful than traditional approaches in that it
would enable, by using a Rasch analysis (Rasch, 1960): the use of ramped
questions and developmental scoring for the youngest children; the use of
equated forms in the data collection; the establishment of developmental
scales that would trace children's progress across the age group in the project
sample; and the provision of formative (diagnostic) reports to teachers (Doig,
1992), and descriptive reports to parents.

Data Collection 

The data for the development of Who Am I? and I Can Do Maths were
collected from a sample of pre-schools, schools, and children from across
Australia. The children attended a total of 84 schools and 47 pre-schools,
including some attached to primary schools. These sites were selected at
random from all states and territories, with the exception of Tasmania. While
not proportionally representative in terms of state, the sample covered a
wide range of sites throughout Australia. From each of the participating
pre-schools and schools, one class at each of the relevant year levels
(pre-school to Year 2) was selected. This provided a total sample of over 4000
children, with about 900 children at each of the pre-school and pre-Year 1
levels, and about 1200 children at each of the Year 1 and Year 2 levels.

Who Am I? 

Data analysis 

Children's responses to the Who Am I? tasks were sorted into a series of
categories, established on the basis of actual responses, that is, like responses
were put together. These categories were ordered by reference to expected
developmental progression as suggested by the research literature. This
same literature was also used to develop the scoring criteria. The process
was repeated for each Who Am I? task. See Adams, Doig, and Rosier (1991)
for another example of these processes being used for categorising
free-response data.

Responses, once categorised, were analysed using Masters' (1982) Partial
Credit Model that provides estimates of the ability needed to achieve that
category of response. That is, it is not assumed that all questions are of equal
difficulty nor that the achievement categories form a set of "steps" that
require the same amount of development to achieve them. For another
example of scoring and analysis of responses that views response categories
as partly correct, see also Tapping Students' Science Beliefs (Adams et al.,
1991; Doig & Adams, 1993).

The Partial Credit Model form of analysis provides a probabilistic
relationship that places children's ability and the category difficulty on the
same scale (see Bond & Fox, 2001, for an explanation of Rasch scales). In
addition, questions were grouped into four sub-scales: Copying (circle, cross,
square, triangle, and diamond); Symbols (name, numbers, letters, words,
sentence); Drawing; and Total (the total of the other three scales).
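
The probabilistic relationship in the Partial Credit Model can be illustrated with a short sketch. This is not the analysis code used in the project; it is a minimal illustration of the standard model form, and the function name, argument names, and example values are hypothetical. Ability and step difficulties are assumed to be expressed in logits on the same scale.

```python
import math

def pcm_category_probabilities(ability, step_difficulties):
    """Return the probability of each score category (0..m) for one task.

    ability: the child's location on the scale (logits).
    step_difficulties: difficulty of each step between adjacent categories;
        the steps are not assumed to be equally spaced.
    """
    # Category k accumulates (ability - step difficulty) over its first k steps;
    # category 0 has an empty sum, so it starts at 0.
    cumulative = [0.0]
    for step in step_difficulties:
        cumulative.append(cumulative[-1] + (ability - step))
    # Subtract the largest value before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]

# Hypothetical task with four steps (score categories 0 to 4).
example = pcm_category_probabilities(0.5, [-1.5, -0.2, 0.8, 2.0])
```

With a single step the expression reduces to the dichotomous Rasch model, which is why both kinds of task can be reported on the same type of scale.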

Because of the possibility of bias in interpreting responses to Who Am I?
tasks, a sense of the inter-rater reliability was required to indicate the
consistency with which the same result would be obtained if the child's
response was scored by different people. To obtain this, the same set of 30
booklets was marked by 21 different raters, all of whom were experienced
teachers. The results of this exercise indicated a satisfactory level of
agreement between the different raters with no more than one score category
difference on any task across the group of raters.

The partial credit analysis provided an estimate of reliability for Who Am
I? of 0.91, indicating a high level of internal consistency for the tasks. Due to
the somewhat novel nature of the Who Am I? assessment tool, some care was
taken to ensure that Who Am I? was valid with respect to its content and
construct. In Who Am I?, tasks focus on aspects of children's development
that are directly related to the objectives of the early years of school
curriculum. Data from the sample provided correlations of about 0.6 between
scores on Who Am I? and scores on the Literacy Baseline (Vincent, Crumpler,
& East London Assessment Group, 1996), which are similar to other reported
correlations between developmental measures (Tymms, 1999).

Table 1

Percentage of Children Achieving Highest Level on Who Am I? Tasks, by School
Level (de Lemos & Doig, 1999a, p. 22). (Reproduced by permission of the Australian
Council for Educational Research Ltd.)

Task        Pre-school and   Pre-Year 1   Year 1       Year 1          Year 2       Year 2
            Pre-primary                   (QLD & WA)   (Other states)  (QLD & WA)   (Other states)
Mean age    4:11             5:9          6:1          6:9             7:1          7:8

Name        21               63           81           92              97           99
Diamond     5                17           36           41              74           65
Numbers     3                30           60           85              98           97
Letters     12               66           67           94              95           98
Words       <1               20           34           71              86           90
Sentence    <1               11           26           58              83           84
Drawing     2                3            4            12              33           35

Total N     866              915          411          924             311          888

Construct validity is based on an accumulation of evidence relating to
the assessment and what it measures. Such evidence would include
correlations with other relevant measures, studies of changes in performance
over time and, particularly in the case of a developmental measure, with
increasing age. In this case, evidence of developmental progression is
indicated by the increase in mean score according to both age and school
level, as indicated in Table 1, which shows the proportion of children by school
level who achieve the highest level on some of the key Who Am I? tasks. For
example, the Name task shows a substantial increase in higher performance
once children enter school (21% in the highest score category at pre-school
level and 63% at pre-Year 1 level). Again, the percentage of children in the
highest score category on the Diamond task increases on entry to school, and
continues to increase with more schooling. The change between Year 1 and
Year 2, for children not in Western Australia or Queensland, shows a smaller
increase (41% to 65%) than the increase for Western Australia or Queensland
(36% to 74%) over the same two-year period. This difference may be
accounted for by curriculum differences between the states. Similarly, the
large difference on the Words task between Queensland and Western
Australian Year 1 students (34%) and other states (71%) is likely to be due to
the extra year of schooling for the latter group.

I Can Do Maths 

Data collection 

The final set of 47 questions for I Can Do Maths was selected on the basis of
providing the widest coverage of the curriculum content, and question
difficulty. Different assessment forms were used at different year levels to
shorten the time required of the children, and to provide the most
appropriate set of questions. The second (harder) set was administered only
to children in their second and third year of school.

Data analysis 

Responses to the I Can Do Maths assessment questions were scored as
correct or incorrect and were analysed using an IRT Rasch analysis (Rasch,
1960; Wright & Stone, 1979) that gives estimates of the ability required of the
child to obtain a particular Total Score. This analysis provides a probabilistic
relationship that places children's ability and the question difficulty on the
same interval scale, thus allowing direct comparisons between different raw
scores in terms of the ability. As part of the analysis, questions were grouped
into sub-scales (Number, Measurement, and Space). These scales are the
basis of the reports for I Can Do Maths.
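
The link between a Total Score and an ability estimate can be sketched as follows. This is an illustrative sketch of the standard dichotomous Rasch model only, not the Quest analysis used in the project; the function names, the search bounds, and the item difficulties are hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def expected_total_score(ability, difficulties):
    """Expected number of correct responses for a child of the given ability."""
    return sum(rasch_probability(ability, d) for d in difficulties)

def ability_for_total_score(total_score, difficulties, low=-10.0, high=10.0):
    """Find the ability whose expected total equals the observed Total Score.

    Uses bisection, which works because the expected total rises steadily
    with ability. Zero and perfect scores have no finite estimate, and the
    search bounds (an assumption of this sketch) must bracket the answer.
    """
    if not 0 < total_score < len(difficulties):
        raise ValueError("Total Score must lie strictly between 0 and the number of questions")
    for _ in range(100):
        mid = (low + high) / 2.0
        if expected_total_score(mid, difficulties) < total_score:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Hypothetical difficulties (in logits) for a five-question form.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
ability = ability_for_total_score(3, difficulties)
```

Because every child with the same Total Score on the same form receives the same ability estimate, a raw-score-to-scale conversion table of the kind used in published reports can be produced once and reused.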

For the development of the published version of I Can Do Maths (Doig
& de Lemos, 2000), the 47 project questions were made into two sets, Level
A (30 questions) and Level B (33 questions), and printed in separate
booklets. The two booklets have some questions in common, and these
common or link questions allow children's ability estimates to be placed on
a single scale, and thus allow their mathematical development to be
monitored over the age range 3 to 9 years (see Kolen, 1999, for an explanation
of test equating).
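
One simple way to see how link questions place two forms on a single scale is a mean-shift adjustment, sketched below. The source does not state which equating procedure was used, so this is an assumption chosen for illustration only; the function names, question labels, and difficulty values are all hypothetical.

```python
def linking_constant(difficulties_a, difficulties_b):
    """Average difference in difficulty of the link questions (A minus B).

    Each argument maps a question label to its difficulty as estimated
    separately on that form. Only questions present on both forms are used.
    """
    common = sorted(set(difficulties_a) & set(difficulties_b))
    if not common:
        raise ValueError("the two forms share no link questions")
    return sum(difficulties_a[q] - difficulties_b[q] for q in common) / len(common)

def to_scale_a(value_on_b, constant):
    """Express a Level B difficulty or ability estimate on the Level A scale."""
    return value_on_b + constant

# Hypothetical estimates: q16 and q17 appear in both booklets.
level_a = {"q1": -2.1, "q16": 0.9, "q17": 1.4}
level_b = {"q16": -0.6, "q17": -0.1, "q30": 1.8}
shift = linking_constant(level_a, level_b)
```

Once the shift is known, any ability estimated from the Level B booklet can be compared directly with one estimated from Level A.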


Reporting Results of Assessment 

While question development is an important aspect of assessment 
construction, the framework for reporting children's performances is equally 
critical. In the case of assessment tools such as Who Am I? and I Can Do 
Maths, the use of IRT analyses of the data provided opportunities for
reporting both summative and formative (diagnostic) results. For Who Am
I?, two forms of report are provided: an Individual Profile that contains both
summative and normative information, and a diagnostic map, or DIAMAP 
(Doig, 1992). 

For I Can Do Maths there are three forms of report provided: diagnostic, 
descriptive, and normative. These reports are based on the underlying scales 
constructed by the Quest Rasch analysis (Adams & Khoo, 1993, 1996) of the 
original project data. The project data for questions included in Level A 
provided the scales for Level A (for children up to the beginning of their
second year at school) and for Level B (for children in their second year of 
schooling). 

Diagnostic Reports 

A particular feature of the scales constructed using Rasch analysis is that
the likelihood of a particular response to a question can be calculated for a 
child with a specific Total Score. This enables a DIAMAP to be constructed 
(Doig, 1992). 

In a DIAMAP, questions that lie below children's Total Score lines are 
expected to be easy for them, while those above this line are expected to be 
too difficult. In other words, the further a question is below the line, the more 
likely it is that it will be answered correctly, and the further above the line, 
the less likely it is that it will be answered correctly. 

To use a DIAMAP, a line is drawn across it at the child's Total Score level. 
A circle is then drawn around each assessment question that the child 
answered correctly. Some of these marked tasks may be above the child's
Total Score line and some below it. Once the Total Score line has been drawn 
and each correctly answered question marked, there are four conditions of 
diagnostic interest: tasks expected to be correct that are correct, tasks expected 
to be correct that are not correct, tasks not expected to be correct that are 
correct, and tasks not expected to be correct that are not correct. There are two 
conditions above the child's score line and two below it, and in each section 
of the DIAMAP there is an expected and an unexpected condition. 
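
The four conditions can be expressed as a small sorting routine. This is a sketch for illustration rather than the published DIAMAP procedure; the function name, the group labels, the question locations, and the rule that a question sitting exactly on the line counts as below it are assumptions.

```python
def diamap_conditions(question_locations, correct_questions, score_line):
    """Sort questions into the four conditions of diagnostic interest.

    question_locations: maps each question label to its position on the scale.
    correct_questions: set of labels the child answered correctly (the circles).
    score_line: position of the child's Total Score line on the same scale.
    """
    groups = {
        "expected_correct": [],      # below the line, answered correctly
        "unexpected_incorrect": [],  # below the line, not correct (possible weakness)
        "unexpected_correct": [],    # above the line, answered correctly (possible strength)
        "expected_incorrect": [],    # above the line, not correct
    }
    for question, location in sorted(question_locations.items()):
        below_line = location <= score_line
        is_correct = question in correct_questions
        if below_line and is_correct:
            groups["expected_correct"].append(question)
        elif below_line:
            groups["unexpected_incorrect"].append(question)
        elif is_correct:
            groups["unexpected_correct"].append(question)
        else:
            groups["expected_incorrect"].append(question)
    return groups

# Hypothetical four-question example with the score line at 0.0.
result = diamap_conditions(
    {"q1": -1.0, "q2": -0.5, "q3": 0.5, "q4": 1.2},
    {"q1", "q3"},
    0.0,
)
```

The two unexpected groups are the ones that carry diagnostic information beyond the Total Score itself.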

A child's specific strengths are shown by the correctly answered
questions above the child's Total Score line. When these questions are within
a particular curriculum or topic area, such as number or addition, then this
may indicate strength in that area. Individual questions, on the other hand,
may reveal particular strengths within an area: for example, a particular
strength in simple addition within Number, but not strength in other aspects
of Number.

Specific weaknesses are shown by the questions answered incorrectly
lying below a child's Total Score line. Again, when these questions are within
a particular curriculum or topic area, this may indicate a general weakness.
Individual questions, on the other hand, may reveal a particular weakness
within an area: for example, a particular weakness with shape recognition
within Space, but not in Space generally.

The DIAMAP shown in Figure 3 is for 6-year-old Daryl, who was
assessed using the Level A assessment form mid-year in his first year of
school in the Northern Territory. Daryl's Total Score was 18. The circled
question numbers on the DIAMAP show which questions he answered
correctly. The DIAMAP score line shows a reasonably clear division between
questions that he could and could not answer successfully, which is expected
in a DIAMAP.

Figure 3. An example of an I Can Do Maths DIAMAP (Doig & de Lemos,
2000, p. 17).

(Reproduced by permission of the Australian Council for Educational Research Ltd.)

Of the Number questions below the score line (expected to be easy for
Daryl), all are correct except question 11 (add 5 and 4) and question 18
(identify the number 65). Daryl's Measurement successes lie below his score
line as expected. Daryl's achievement on spatial questions is very good, and
well beyond what is expected of a child with his overall score. The overall
picture is of a competent child in counting and measurement, but not yet
familiar with the conventions of mathematics that enable success with
formal work. For a child at mid-year in the first year of school, it is likely that
he has not as yet been exposed to the formal aspects of the curriculum.

Daryl's DIAMAP alerts us to a possible problem with first and last
(questions 20 and 21), but more importantly shows that a reliance on a score 
alone would not provide a complete picture of Daryl's abilities, nor provide 
suggestions for future learning experiences. 

The Diagnostic Map for Who Am I? is interpreted in the same way as that for
I Can Do Maths, except that the questions are scored at several levels rather
than simply correct or incorrect. In Figure 4, each highest level of response
has been circled and a line drawn across the Diagnostic Map at Ronnie's Total
Score (22). This line divides the assessment levels (scores) on the Who Am I? 
tasks into those he is expected to achieve (below his Total Score line) and 
those he is not expected to achieve (above his Total Score line). 

Figure 4. An example of a Who Am I? DIAMAP (de Lemos & Doig, 1999a,
p. 16).

(Reproduced by permission of the Australian Council for Educational Research Ltd.)

As can be seen in Figure 4, most of Ronnie's highest assessment levels
are for the Copying tasks. The high assessment at Level 4 for the Circle and
Name tasks is in contrast to Ronnie's results for the Symbols tasks (Numbers
- Level 1, Letters - Level 1, Words and Sentence - Level 0).

Individual Reports 

Figure 5 shows an Individual Profile for reporting the results of Who Am I? 
The interpretation of this report is the same as for any normative report. 
These normative comparisons provide a guide to the performance expected 
of children and allow for variation in performance between children 
stemming from individual differences. These norms also allow teachers to 
determine where each child is in relation to other children, as a basis for 
grouping children for different types of activities. The shaded bands show
the expected range of scores for the middle 80% of children in each of the 
distinct state school structures across Australia. 

Figure 5. An example of a Who Am I? Individual Profile (de Lemos & Doig,
1999a, p. 13).

(Reproduced by permission of the Australian Council for Educational Research Ltd.)

Maryanne's score on the Symbols scale (9) shows that her development
in this area is below that expected of children at her stage of schooling. Her
Copying and Drawing scores (14 and 3 respectively) place her just within the
expected range for a Preparatory year child in Victoria and her Total Score
(26) places her below the expected level for children on entry to school.

An I Can Do Maths individual report is interpreted in a similar manner
(see Doig & de Lemos, 2000, for examples of I Can Do Maths individual
reports).

Descriptive Reports 

The interpretation of an I Can Do Maths Descriptive Report, as shown in
Figure 6, parallels that of a DIAMAP insofar as descriptions of performance
below the Total Score line are likely to have been achieved, while those
descriptions of performance above the line are yet to be achieved. Note that
where there is more than one description of ability given, the higher
description on the scale subsumes lower descriptions. The example in Figure
6 is for Daryl, in his first year of school.






Figure 6. An example of an I Can Do Maths Descriptive Report (Doig & de
Lemos, 2000, p. 19).

(Reproduced by permission of the Australian Council for Educational Research Ltd.)


Other Uses of These Assessment Tools 

Both Who Am I? and I Can Do Maths have, in a sense, come back to their
origins. These assessment tools started as instruments for use in a research
project, were refined and developed into classroom assessment tools, and
have now been employed for data collection in both Australian and overseas
research studies. Some of these studies are described below.

Comparative Research

Who Am I? has been used to compare pre-school children's development in 
three different cultural groups: Chinese children in Hong Kong, Anglo- 
Indian children in India, and Australian Indigenous children. The results of 
this series of studies are given after the description of each sample group (de
Lemos, 2002; de Lemos & Doig, 2000). 

In the study of pre-school development in Hong Kong, the children were
of Chinese origin and had Cantonese as their mother tongue, although with
varying levels of English proficiency. The sample of 60 children was between
3 and 6 years of age, and at different levels of pre-school.

The administration of Who Am I? was undertaken by the children's
pre-school teacher. Children's responses were in a mixture of Cantonese and
English and scripts were scored by a native speaker of Chinese, with
adaptations for Chinese script. That is, the original scoring criteria were
maintained by, for example, equating English letters with simple Chinese
characters, and English words with complex Chinese characters.

The Indian sample consisted of 249 children from Himachal Pradesh
(near the border of Tibet), mainly from rural communities with
Hindi as their mother tongue. The children's ages were from 4 years to over
10 years, with most aged between 5 and 6 years. Who Am I? was
administered in Hindi, under the guidance of the National Institute of
Educational Planning, New Delhi. The children responded in Hindi, and as
with the Hong Kong group, adaptations were made for language variations.

Australian Indigenous children were administered Who Am I? at two
sites: a pre-school and a primary school. Children were assessed at the end
of their pre-school year and at the end of their first year at school. As well,
Who Am I? was administered to a sample of 523 urban Australian pre-school
and first year of school children. All of these Australian children were
administered Who Am I? in English.

A summary of the results of these administrations for children at a
similar age and level of schooling is given in Table 2.


Table 2

Comparison between International and National Groups On Two Who Am I?
Scales (adapted from de Lemos & Doig, 2000)

Group                   N     Mean age  Copying mean  Symbols mean  Total mean

Hong Kong               19    6.0       17.7          18.5          39.2
India                   249   5.9       13.3          10.7          25.9
Indigenous Australian   60    5.5       14.0          8.8           25.9
Urban Australian        523   5.9       15.8          15.3          33.9

A comparison of the Total Score means displays the differences between
groups from rural (India and Indigenous Australian) and urban (Australia
and Hong Kong) areas. However, there appears to be a
substantive link between performance on 'copying' tasks and 'symbols' tasks
for Hong Kong and Australian children, but this link is less clear for the rural
groups (India and Indigenous Australian), although questions about familiarity
with, or exposure to, print for these two groups may arise. More complete
details of these studies can be found in de Lemos and Doig (2000), and de Lemos
(2002).

The National Longitudinal Survey of Children and Youth

Canadian data on Who Am I? have been collected from over seven hundred 
5- and 6-year-olds at pre-school level as part of the North York Community 
Project in Ontario. The North York study is, in turn, part of the broader 
Canadian National Longitudinal Survey of Children and Youth that is being 
conducted by Statistics Canada and the federal government department of 
Human Resources Development Canada. This national study is following 
the development of children from birth to early adulthood, and includes 
both health and educational aspects. The study is designed to monitor the 
impact of factors that influence children's social, emotional, and behavioural 
development (Statistics Canada, 2004). 

Initial data from the North York Community Project indicated that 
Canadian children's performance was similar to that of Australian children 
at a comparable level of schooling (pre-Year 1). Further data were collected 
from a sample of 12,000 4- and 5-year-olds and these suggested that Who 
Am I? scores are less sensitive to differences in a child's home language than 
the Peabody Picture Vocabulary Test that was administered at the same time 
as Who Am I? (North York Early Years Action Group, 2000). A more 
extensive report on the continuing Canadian use of Who Am I? is found in 
de Lemos (2002). 

Project Good Start 

Project Good Start (Doig & Rowe, 2002) had as one of its aims to raise the 
awareness of educators and the general community of the considerable 
achievements of young children (Doig et al., 2003). This on-going study is 
examining the development of the mathematical skills and understandings 
of some 3000 children in their year-before-school and during their early years 
of school. 

Thomson (2004) reported that most children in the project were attaining 
Level 3 or Level 4 on the majority of the copying tasks in Who Am I?, 
although there were significant gender differences. For example, on the item 
asking children to draw a circle, 73% of boys and 86% of girls were assessed 
at Level 3 or 4, and on a similar item asking children to draw a triangle, 48% 
of boys and 60% of girls reached Level 3 or 4. Children's performances on 
I Can Do Maths surprised pre-school staff and parents alike. Results on I Can 
Do Maths showed that 40% of the children in the year prior to entering 
school were able to identify a cylinder, and that 30% could solve "4 more 
than 5" and "2 less than 6" (Peck, 2003). Further findings from this project 
are reported in Thomson (2004). 

Re-analysis of I Can Do Maths data 

In a further use of I Can Do Maths, the data used in its development were 
re-analysed to present a picture of children's development in the areas of 
Number, Measurement, and Space. Doig and de Lemos (2000, 2003) argued 
that the hops, steps, and jumps in the development of both individuals and 
groups as they grow as mathematicians can be either hurdles or gateways 
to development. The re-analysis demonstrated that good assessment, 
combined with good analysis, might reveal more than simply children's 
achievement of curriculum content. In the re-analysis, the I Can Do Maths 
questions were put in order of their Rasch difficulty estimates (see Doig & de 
Lemos, 2000, 2003, for details of the original Rasch analysis). The percentage 
of children responding correctly to each item was then plotted against the 
corresponding item. These percentages revealed that, as one would expect, 
the proportion of children responding correctly increased as they proceeded 
through school. Figure 7 gives the details of the Space re-analysis by Doig 
and de Lemos (2003). 
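
The re-analysis procedure described above can be sketched in a few lines of 
Python. This is only an illustrative sketch: the function name, the data 
layout, the item labels, and all of the numbers are made up for the example 
and are not taken from the I Can Do Maths data or from Doig and de Lemos 
(2003). It assumes that Rasch difficulty estimates are already available from 
a separate analysis. 

```python
from collections import defaultdict


def percent_correct_by_item(responses, difficulties):
    """Order items by difficulty and tabulate percent correct per year level.

    responses    -- iterable of (year_level, item_id, correct) tuples
    difficulties -- dict mapping item_id to a Rasch difficulty estimate
                    (hypothetical values, assumed to come from a prior analysis)
    Returns (ordered_items, table), where table[year_level] is a list of
    percentages aligned with ordered_items (None if no responses).
    """
    counts = defaultdict(lambda: [0, 0])  # (year, item) -> [n_correct, n_total]
    for year, item, correct in responses:
        cell = counts[(year, item)]
        cell[0] += 1 if correct else 0
        cell[1] += 1

    # Easiest item first, hardest item last.
    ordered_items = sorted(difficulties, key=difficulties.get)

    table = {}
    for year in sorted({year for year, _ in counts}):
        row = []
        for item in ordered_items:
            n_correct, n_total = counts.get((year, item), (0, 0))
            row.append(100.0 * n_correct / n_total if n_total else None)
        table[year] = row
    return ordered_items, table


# Made-up illustration: three items, two year levels, two children each.
difficulties = {"Q9": 1.8, "Q2": -1.1, "Q5": 0.3}
responses = [
    ("pre-Year 1", "Q2", True), ("pre-Year 1", "Q2", True),
    ("pre-Year 1", "Q5", True), ("pre-Year 1", "Q5", False),
    ("pre-Year 1", "Q9", False), ("pre-Year 1", "Q9", False),
    ("Year 2", "Q2", True), ("Year 2", "Q2", True),
    ("Year 2", "Q5", True), ("Year 2", "Q5", True),
    ("Year 2", "Q9", True), ("Year 2", "Q9", False),
]
items, table = percent_correct_by_item(responses, difficulties)
for year, row in table.items():
    print(year, dict(zip(items, row)))
```

Plotting each row of the table against the ordered items gives one line per 
year level. Lines that lie close together indicate items on which 
performance differs little between year levels. 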



Figure 7. The Space questions from I Can Do Maths, in order of difficulty 
(Doig & de Lemos, 2003, p. 276). 


There appeared to be little difference in children's learning trajectories, 
that is, the harder questions were difficult for all, not just children in the 
lower year levels. The most difficult question involved distinguishing left 
from right (Question 9). 
This similarity across year levels raises the issue of appropriateness of 
curriculum content. For all but two questions, children's performances 
between pre-Year 1 and Year 2 differed little, suggesting an under-estimation 
of younger children's spatial abilities. Again this appeared to be the case for 
Measurement. The re-analysis implies that items dealing with comparisons 
of attributes such as length or area show little difference in correct response 
levels across the early years of school, suggesting that children are not being 
challenged by the curriculum. 

In Number, too, there appeared to be little difference between the year 
levels for a number of curriculum aspects. Differences did exist, however, 
between children in the later years of schooling, once more formal arithmetic 
questions appeared. For example, the question "Tom had 5 gum-nuts and 
found 4 more. How many does he have now?" produced about a 10% 
difference between the earlier and later year levels. The Doig and de Lemos 
(2003) re-analysis identified clearly that there exist some apparent problems 
with curriculum expectations vis-a-vis children's abilities. 

Discussion 

The purposes of this article were: to demonstrate the possibilities for 
standardised assessment in the early years; to show how IRT analyses can 
provide reporting formats for assisting early years professionals; and to 
describe some examples of these assessment tools in research contexts. 

In this article, descriptions were provided of the development of two 
mathematics assessment tools designed for use at the pre-school and early 
years of school levels, where formal assessment is rare. In Australia, at the 
present time, there is widespread emphasis on interviewing children as they 
enter school, with a view to providing appropriate learning experiences for 
individual children. The seminal work of Wright and his colleagues (e.g., 
Wright, 1991, 1994, 1999; Wright, Martland, & Stafford, 2000) and the 
subsequent development of the Count Me In suite of early and later years 
programmes, is an outstanding example of the clinical interview genre in 
assessment. Clinical interview techniques have been used and promoted for 
many years as the best way of assessing children's mathematical abilities, a 
view with which the author agrees (Hunting & Doig, 1997). As both Count 
Me In and the Early Numeracy Research Project (Clarke, Sullivan, 
Cheeseman, & Clarke, 2000) have shown, this approach can have profound 
effects on both teaching and learning in the early years. 

However, the time needed for interviewing means that programmes 
tailored to children's needs are not commenced for some time after the 
interview has taken place, which can make the interview data out-of-date due 
to the rapid development of children in their first year of school. Thus, the 
necessity for a group assessment is obvious if one considers that the planning 
and implementation of appropriate learning experiences should occur close 
to the time of assessment. The two examples discussed in this paper show 
that for very young children it is possible, through oral administration, to use 
more formal, standardised forms of assessment. This raises the possibility of 
large-scale studies of children's mathematical development, not possible with 
interview or one-on-one methodologies. The Canadian longitudinal study, for 
example, is a case in point. Cross-national studies provide valuable insights 
into our own, and others', educational practices, and the use of standardised 
tools is a necessity in this form of research. 

The cross-cultural validity of Who Am I? has been, in part, confirmed by 
the Canadian longitudinal survey through comparisons with the Peabody 
Picture Vocabulary Test. Who Am I? shows little bias due to language factors. 
Similarly, the Hong Kong and Indian research demonstrates that Who Am I? 
can be adapted to non-Roman alphabets and ideographic writing. This 
appears to be unique in mathematics assessment at any level. 

The Who Am I? and I Can Do Maths forms of assessment combined with 
Rasch scaling have three advantages over other approaches to assessment. 
First, they allow monitoring across years of development, the reason for the 
use of I Can Do Maths in Project Good Start, where this feature allows 
researchers to track the development of children as they pass from pre-school 
to the early years of school. Second, they enable the construction of reports 
that provide both summative and formative information. The examples of 
reports in this paper clearly show how appropriate analysis can be effective 
in informing educators about the strengths and weaknesses of the students 
across a range of years. Finally, the re-formatting of I Can Do Maths data, as 
illustrated by Doig and de Lemos (2003), demonstrates the power of the 
Rasch scale that defines a difficulty for each question for the entire sample 
age range. 
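
For readers less familiar with the model, the sense in which a Rasch scale 
defines a difficulty for each question can be made concrete. The article does 
not state which variant of the model was fitted, so the standard dichotomous 
form is assumed here purely for illustration; the symbols are notation 
introduced for this sketch, with theta_n the ability of child n and delta_i 
the difficulty of question i, both expressed on the same scale. 

```latex
P(X_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
```

In this form the difficulty parameter for a question does not depend on the 
year level of the child answering it, which is why a single estimate can 
serve the whole age range. A child whose ability equals a question's 
difficulty has a probability of 0.5 of answering it correctly. 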


Conclusion 

The use of these assessment tools by researchers internationally suggests that 
there exists a need for standardised, early years mathematics assessments. 
While not all early years professionals need or want such tools for their 
particular contexts, there are others whose interests lie in mapping children's 
mathematical abilities. Quality assessment tools provide a means of 
achieving a mapping over time, place, or culture. Further, tools such as these 
provide a language for discussion about contemporary issues, such as the 
pre-school to school transition, which benefits practitioners and researchers 
and which, in turn, should benefit the children we serve. 